Roger Grosse
(http://people.csail.mit.edu/rgrosse/)
MIT
Wednesday 31st October 2012
Time: 4pm
B10 Basement Seminar Room
Alexandra House, 17 Queen Square, London, WC1N 3AR
Model selection in a large compositional space
We often build complex probabilistic models by "composing" simpler models -- using one model to generate the latent variables for another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. I'll present a space of matrix decomposition models defined by composing a small number of probabilistic modeling motifs, such as clustering, low-rank factorizations, and binary latent factor models. This compositional structure can be represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.
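The grammar-plus-greedy-search idea can be sketched in a few lines. The following is a minimal illustrative sketch, not the talk's actual code: the production rules loosely mimic the motifs described above (low-rank, clustering, binary latent factors, with a generic Gaussian leaf symbol "G"), and the scoring function is a toy stand-in for the predictive-likelihood estimate the method would actually use.

```python
# Hypothetical sketch of a context-free grammar over matrix decomposition
# structures. Symbol names (G, M, B) and the toy score are illustrative
# assumptions, not taken from the talk.

# Each production rewrites a leaf "G" (a generic Gaussian matrix) into a
# composite structure built from simpler motifs.
PRODUCTIONS = {
    "G": [
        ("low-rank",       ("+", ("*", "G", "G"), "G")),  # G -> GG + G
        ("clustering",     ("+", ("*", "M", "G"), "G")),  # G -> MG + G
        ("binary-factors", ("+", ("*", "B", "G"), "G")),  # G -> BG + G
    ]
}

def expand_once(structure):
    """Yield every structure reachable by one production applied to one leaf."""
    if isinstance(structure, str):
        for _, rhs in PRODUCTIONS.get(structure, []):
            yield rhs
        return
    op, *children = structure
    for i, child in enumerate(children):
        for new_child in expand_once(child):
            new_children = list(children)
            new_children[i] = new_child
            yield (op,) + tuple(new_children)

def leaves(structure):
    """Collect the leaf symbols of a structure, left to right."""
    if isinstance(structure, str):
        return [structure]
    return [leaf for child in structure[1:] for leaf in leaves(child)]

def greedy_search(start, score, max_steps=3):
    """Greedily apply whichever single production most improves the score."""
    current, best = start, score(start)
    for _ in range(max_steps):
        candidates = [(score(s), s) for s in expand_once(current)]
        if not candidates:
            break
        cand_score, cand = max(candidates, key=lambda t: t[0])
        if cand_score <= best:
            break  # no expansion improves the score: stop refining
        current, best = cand, cand_score
    return current

# Toy score rewarding clustering leaves ("M") and penalizing complexity --
# a stand-in for the predictive-likelihood estimate used in practice.
toy_score = lambda s: 3 * leaves(s).count("M") - len(leaves(s))
best = greedy_search("G", toy_score)
```

Under this toy score the search repeatedly refines the model with the clustering motif, which mirrors the key property described above: only the structures along the greedy path are ever evaluated, a small fraction of the full grammar.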